Noise Estimation based on Entropy without using VAD for Speech Enhancement
Abstract
A practical speech enhancement system consists of two major components: estimation of the noise power spectrum and estimation of the speech. In single-channel speech enhancement systems, most algorithms require an estimate of the average noise spectrum because a secondary channel is not available. This requires a reliable speech/silence detector, so speech/silence detection can be a determining factor in the performance of the whole speech enhancement system. The speech/silence detector identifies the frames of the noisy speech that contain only noise. If the detection is not accurate, speech echoes and residual noise tend to be present in the enhanced speech. The performance of a noise estimation algorithm is usually a tradeoff between speech distortion and noise reduction. In existing methods, noise is estimated only during speech pauses, and these pauses are identified using a voice activity detector (VAD). This paper describes a novel method for estimating noise in non-stationary environments. The approach uses an algorithm that classifies the noisy speech signal into pure speech, quasi speech and non-speech frames based on adaptive thresholds, without using a VAD. Speech presence is determined by computing the ratio of the noisy speech power spectrum to its local minimum, which is computed by averaging past values of the noisy speech power spectra with a look-ahead factor. The proposed method is evaluated using segmental SNR as the criterion and is compared with the weighted average noise estimation method. The simulation results show that the proposed algorithm performs better than conventional methods.
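The speech-presence test described in the abstract can be illustrated with a rough sketch. The abstract gives no update equations or threshold rules, so everything specific below is an assumption: the recursive minimum-tracking form (a commonly used continuous minimum tracker with a look-ahead factor), the parameter values `eta`, `gamma` and `beta`, the fixed thresholds `low` and `high` (the paper uses adaptive thresholds), the decision on the mean ratio across bins, and the hypothetical function name `classify_frames`. It is not the authors' implementation.

```python
def classify_frames(power_frames, eta=0.7, gamma=0.998, beta=0.8,
                    low=2.0, high=5.0):
    """Label each frame as 'non-speech', 'quasi speech' or 'pure speech'.

    power_frames: list of non-empty frames; each frame is a list of noisy
    power values |Y(k)|^2, one per frequency bin.
    eta, gamma, beta, low and high are assumed values, not from the paper.
    """
    if not power_frames:
        return []
    eps = 1e-12
    smoothed = list(power_frames[0])            # smoothed power P(0, k)
    minimum = [max(p, eps) for p in smoothed]   # local minimum P_min(0, k)
    # Assumption: the first frame is noise only (its ratio is 1 by construction).
    labels = ["non-speech"]
    for frame in power_frames[1:]:
        ratio_sum = 0.0
        for k, y in enumerate(frame):
            prev = smoothed[k]
            cur = eta * prev + (1.0 - eta) * y  # recursive average of past values
            if minimum[k] < cur:
                # Power is rising: let the minimum drift up slowly,
                # using the look-ahead factor beta.
                cand = (gamma * minimum[k]
                        + (1.0 - gamma) / (1.0 - beta) * (cur - beta * prev))
                # Guard added for this sketch: keep the minimum positive and <= cur.
                minimum[k] = min(cur, max(cand, eps))
            else:
                minimum[k] = max(cur, eps)      # power fell: follow it at once
            smoothed[k] = cur
            ratio_sum += cur / minimum[k]       # ratio of power to its local minimum
        mean_ratio = ratio_sum / len(frame)
        if mean_ratio < low:
            labels.append("non-speech")
        elif mean_ratio < high:
            labels.append("quasi speech")
        else:
            labels.append("pure speech")
    return labels
```

In a noise estimator built on such a classifier, the noise spectrum would typically be updated fully on non-speech frames, partially on quasi speech frames, and held on pure speech frames; the abstract does not state the exact update rule, so that step is left out here.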
Similar resources
Speech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering
Gaussian Mixture Models (GMMs) of power spectral densities of speech and noise are used with explicit Bayesian estimations in Wiener filtering of noisy speech. No assumption is made on the nature or stationarity of the noise. No voice activity detection (VAD) or any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined system of equatio...
A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain
The quality of a speech signal degrades significantly in the presence of environmental noise, which impairs the performance of hearing aid devices, automatic speech recognition systems, and mobile phones. This paper considers single-channel enhancement of speech signals corrupted by additive noise. A dictionary-based algorithm is proposed to train the speech...
A priori SNR estimation and noise estimation for speech enhancement
A priori signal-to-noise ratio (SNR) estimation and noise estimation are important for speech enhancement. In this paper, a novel modified decision-directed (DD) a priori SNR estimation approach based on single-frequency entropy, named DDBSE, is proposed. DDBSE replaces the fixed weighting factor in the DD approach with an adaptive one calculated according to the change in single-frequency entropy....
A Fast Convergence Speech Enhancement Method
A fast-convergence speech enhancement method is proposed in this paper. A noise estimation acceleration technique is applied to the conventional statistical-model-based algorithm to shorten the convergence time after a sudden change in noise intensity. First, burst detection is performed on the power spectrum of the noisy signal. Next, the log-likelihood ratio (LLR) based VAD is used in th...
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models with the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...